A partition-based Summary-Graph-Driven Method for Efficient RDF Query Processing
Abstract
RDF query optimization is a challenging problem. Although many factors and their impact on query efficiency have been studied, the problem still needs further investigation. We observe that decomposing a query into a series of lightweight operations is also effective in speeding up query processing. Given the linked nature of RDF data, the correlations among these operations must be handled carefully. In this paper, we present SGDQ, a novel framework that features a partition-based Summary Graph Driven Query for efficient query processing. SGDQ partitions the data and models the partitions as a summary graph. A query is decomposed into subqueries that can be answered without inter-partition processing. The final results are derived by performing summary graph matching and joining the results generated by all matched subqueries. In essence, SGDQ combines the merits of graph match processing and relational join-based query implementation. It deliberately avoids maintaining huge intermediate results by organizing subquery processing in a summary-graph-driven fashion. Our extensive evaluations show that SGDQ is an effective framework for efficient RDF query processing: its query performance consistently exceeds that of representative state-of-the-art systems.
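The general idea of using a summary graph to decide which partitions to read can be illustrated with a toy sketch. This is not the SGDQ algorithm from the paper, whose details the abstract does not give. It is a simplified illustration under stated assumptions: vertices are assigned to partitions by a given map, every triple contributes one summary edge (intra-partition triples become self-loops), predicates in the query are constants, and candidate partition assignments are enumerated by brute force. All function and variable names (`build_summary`, `match_summary`, `evaluate`, `answer`, `part_of`) are hypothetical.

```python
from itertools import product

def is_var(term):
    return term.startswith("?")

def build_summary(triples, part_of):
    # One summary edge per (source partition, predicate, target partition).
    # Assumption: intra-partition triples appear as self-loops.
    return {(part_of[s], p, part_of[o]) for s, p, o in triples}

def match_summary(query, summary, part_of, partitions):
    # Brute-force enumeration of assignments of query vertices to partitions
    # such that every query edge is backed by a summary edge.
    terms = sorted({t for s, _, o in query for t in (s, o)})
    choices = [partitions if is_var(t) else [part_of[t]] for t in terms]
    for combo in product(*choices):
        m = dict(zip(terms, combo))
        if all((m[s], p, m[o]) in summary for s, p, o in query):
            yield m

def evaluate(query, triples, part_of, m):
    # Join the triple patterns, reading only triples whose endpoints lie in
    # the partitions selected by the summary match m.
    bindings = [{}]
    for s, p, o in query:
        extended = []
        for b in bindings:
            for ts, tp, to in triples:
                if tp != p or part_of[ts] != m[s] or part_of[to] != m[o]:
                    continue
                nb = dict(b)
                ok = True
                for term, val in ((s, ts), (o, to)):
                    if is_var(term):
                        # Bind the variable, or check an existing binding.
                        if nb.setdefault(term, val) != val:
                            ok = False
                    elif term != val:
                        ok = False
                if ok:
                    extended.append(nb)
        bindings = extended
    return bindings

def answer(query, triples, part_of):
    partitions = sorted(set(part_of.values()))
    summary = build_summary(triples, part_of)
    results = []
    for m in match_summary(query, summary, part_of, partitions):
        results.extend(evaluate(query, triples, part_of, m))
    return results

# Hypothetical example data, not taken from the paper.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "worksAt", "acme"),
    ("carol", "knows", "dave"),
    ("dave", "worksAt", "acme"),
    ("erin", "knows", "alice"),
]
part_of = {"alice": 0, "bob": 0, "carol": 1, "dave": 1, "erin": 1, "acme": 2}
query = [("?x", "knows", "?y"), ("?y", "worksAt", "?c")]
results = answer(query, triples, part_of)
```

In this sketch the summary match rules out partition combinations that cannot contribute answers before any triples are joined, so each join step scans only the triples of the selected partitions. A real system would replace the brute-force enumeration and the nested-loop join with indexed structures.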
Similar resources
Querying Distributed RDF Graphs: The Effects of Partitioning
Web-scale RDF datasets are increasingly processed using distributed RDF data stores built on top of a cluster of shared-nothing servers. Such systems critically rely on their data partitioning scheme and query answering scheme, the goal of which is to facilitate correct and efficient query processing. Existing data partitioning schemes are commonly based on hashing or graph partitioning techniq...
Hybrid Graph based Keyword Query Interpretation on RDF
Adopting keyword query interface to semantic search on RDF data can help users keep away from learning the SPARQL query syntax and understanding the complex and fast evolving data schema. The existing approaches are divided into two categories: instance-based approaches and schema-based approaches. The instance-based approaches relying on the original RDF graph can generate precise answers but ...
BitMat: An In-core RDF Graph Store for Join Query Processing
With the growing size of RDF data sources, the need for a compact representation providing efficient query interface has become compelling. In this paper, we introduce BitMat, a main memory based compressed bit-matrix structure. The key aspects of BitMat are as follows: i) its RDF graph representation is very compact compared to the conventional disk-based and existing main-memory RDF stores, a...
Scaling Queries over Big RDF Graphs with Semantic Hash Partitioning
Massive volumes of big RDF data are growing beyond the performance capacity of conventional RDF data management systems operating on a single node. Applications using large RDF data demand efficient data partitioning solutions for supporting RDF data access on a cluster of compute nodes. In this paper we present a novel semantic hash partitioning approach and implement a Semantic HAsh Partition...
gStore: Answering SPARQL Queries via Subgraph Matching
Due to the increasing use of RDF data, efficient processing of SPARQL queries over RDF datasets has become an important issue. However, existing solutions suffer from two limitations: 1) they cannot answer SPARQL queries with wildcards in a scalable manner; and 2) they cannot handle frequent updates in RDF repositories efficiently. Thus, most of them have to reprocess the dataset from scratch. ...
Journal title:
- CoRR
Volume: abs/1510.07749
Publication date: 2015